Search Results: "Raphael Geissert"

28 August 2013

Raphael Geissert: Scheduling mistake

Sometimes people make mistakes and things don't go as expected. Eventually, they find out and say oops.

The next blog post in the "a bashism a week" series was scheduled for publication next Wednesday due to a mistake on my side. Sorry about that! Given that it's a bit late today, let's take a break this week.

21 August 2013

Raphael Geissert: A bashism a week: function names

The bashism of this week is easy to hit when overriding the execution of a command with a shell function. Think of the following example scenario:

Replacing the yes(1) command with a shell function:

$ exec /bin/bash
$ type -t yes
file
$ yes() { while :; do echo ${1:-y}; done; }
$ type -t yes
function

Now every time yes is called the above-defined shell function will be called instead of /usr/bin/yes.

Apply the same principle to replace the run-parts(8) command with the following overly-simplified shell function:

$ run-parts() {
    if [ "$1" = "--test" ]; then
        shift;
        simulate=true;
    else
        simulate=false;
    fi;
    for f in "$1"/*; do
        [ -x "$f" ] && [ -f "$f" ] || continue;
        case "$(basename "$f")" in 
            *[!a-zA-Z0-9_/-]*)
                :
            ;;
            *)
                if $simulate; then
                    echo $f;
                else
                    $f;
                fi
            ;;
        esac;
    done
}
$ type -t run-parts
function

(note the use of negative matching)

It also works as expected. However, when running it under a shell that only supports the function names required by the POSIX:2001 specification it will fail. One such shell is dash, which aborts with a "Syntax error: Bad function name", another is posh which aborts with a "run-parts: invalid function name".
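One way to catch such names before they reach a stricter shell is to test them against the portable character set. The following is only a sketch: is_portable_name is a hypothetical helper, and it assumes that a portable function name consists solely of letters, digits and underscores and does not start with a digit.

```shell
# is_portable_name is a hypothetical helper, not part of any shell.
# It succeeds only for names made of letters, digits and underscores
# that do not start with a digit.
is_portable_name() {
    case "$1" in
        ''|[0-9]*|*[!a-zA-Z0-9_]*)
            return 1
        ;;
        *)
            return 0
        ;;
    esac
}

for name in run_parts run-parts @; do
    if is_portable_name "$name"; then
        echo "$name: portable"
    else
        echo "$name: not portable"
    fi
done
```

Like the run-parts function above, it relies on negative matching in a case pattern, which is itself portable.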

If you ever want to have function names with dashes, equal signs, commas, and other unusual characters make sure you use bash and ksh-like shells (and keep that code to yourself). Yes, you can even have batch-like silencing of stdout with

function @ { "$@" > /dev/null ; }

Update: there were missing quotation marks in the example @ function.

14 August 2013

Raphael Geissert: A bashism a week is back

After a while without posts on the a bashism a week series, it is coming back!

Next week, at the usual time and day of the week, the series of blog posts about bashisms will be back for at least one more month. Subscribe via Atom so you don't miss any post, and check out all the previous posts.

The "a bashism a week" series covers some of the differences between bash and the behaviour of other shells, and the requirements of the POSIX standard regarding shell scripting. Or put simply: the posts are a guide to common bashisms, allowing you to identify them and avoid their use for more compatible and portable code.

Happy reading!

31 July 2013

Raphael Geissert: Ten-year-old ebook reader

I own a ten-year-old ebook reader.

Is it a smartphone?
It has features that make it pretty much like one, except that it can't make phone calls. So no. (However, I dare say that making phone calls is one of the least used features of smartphones nowadays. Perhaps in the future smartphones won't even be phones anymore, due to atrophy.)

What is it then?
It is a Sony CLIÉ, a PEG-SJ22 to be more precise. A PDA running Palm OS 4.1 that I'm now using as an ebook reader. To my surprise, the Plucker and iSilo readers still exist and at least the latter seems somewhat alive - there is even a version for Android.

Nowadays there are ebook readers with electronic paper displays, wifi or 3G connectivity, and other features but they all come down to the same thing: an ebook reader. Truth be told, the technology is quite old. In 2004, a year after the release of the PEG-SJ22, Sony also released the LIBRIé EBR-1000EP in Japan, an ebook reader with a 6" electronic paper display. A few years later it was released in the US as the Sony Reader.

Certainly, there have been advances since those devices were first released, but I have yet to see something that is really innovative. This year we are back to wrist watches, which were first released more than ten years ago - and one of them even ran Linux.

Netbooks also had devices like the PEG-UX50 as their ancestor.

And if you were thinking about shoes, you've arrived late: the Puma RS already had some chips in them, and the Adidas 1 were also sported a few years ago. World, it's time to innovate.

9 July 2013

Raphael Geissert: Explaining segmentation fault errors

Want to fix that segfault you keep hitting, or one that was reported to you? The first step is to understand the error message you get.

So you have a message like the following:
segfault at bfea3fec ip 080ee07e sp bfea3fa0 error 6

You might already know that ip means instruction pointer and sp means stack pointer and as such the addresses that follow them are the values in those registers. But what does the error number mean?

The error number, or code, actually gives you a better explanation of what the cause of the segfault is. The number's bits are flags describing the error and are architecture-dependent. For x86/x86_64 I just wrote an online converter/decoder that you can use to explain the segfault error code.

As an example, the above error code is explained as:

The cause was a user-mode write resulting in no page being found.

And the common error 4:

The cause was a user-mode read resulting in no page being found.

(also known as a null pointer dereference).
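To illustrate how the flags combine, here is a minimal shell sketch. It is not the online converter mentioned above: explain_segfault is a hypothetical helper that decodes only the three lowest bits of the x86 page-fault error code (bit 0: protection fault rather than missing page, bit 1: write rather than read, bit 2: user mode rather than kernel mode) and ignores the higher ones.

```shell
# explain_segfault is a hypothetical helper; it looks only at the
# three lowest bits of an x86/x86_64 page-fault error code.
explain_segfault() {
    code=$1

    # Bit 2: set when the access came from user mode
    if [ $(($code & 4)) -ne 0 ]; then mode=user-mode; else mode=kernel-mode; fi
    # Bit 1: set when the access was a write
    if [ $(($code & 2)) -ne 0 ]; then access=write; else access=read; fi
    # Bit 0: set when the page was present but access was denied
    if [ $(($code & 1)) -ne 0 ]; then
        result="a protection fault"
    else
        result="no page being found"
    fi

    printf 'The cause was a %s %s resulting in %s.\n' "$mode" "$access" "$result"
}

explain_segfault 6
explain_segfault 4
```

The two calls reproduce the explanations quoted above for error 6 and error 4.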

Enjoy.

7 July 2013

Raphael Geissert: Dealing with bashisms in proprietary software

Sometimes it happens that for one reason or another there's a need to use a proprietary application (read: can not be modified due to its licence) that contains bashisms. Since the application can not be modified and it might not be desirable to change the default /bin/sh, dealing with such applications can be a pain. Or not.

The switchsh program (available in Debian) by Marco d'Itri can be used to execute said application under a namespace where bash is bind-mounted on /bin/sh. The result:

$ sh --help
sh: Illegal option --
$ switchsh sh --help | head -n1
GNU bash, version 4.1.5(1)-release-(i486-pc-linux-gnu)

Simple, yet handy.

12 June 2013

Raphael Geissert: Service update: 5 million a day

After the release of Debian wheezy, traffic jumped to about 1 million requests per day, but as the weeks have passed traffic has continued to increase, to 5 million requests every day.

Even though it is a new record for the redirector it can not yet be compared to openSUSE.org's 20-40 million on their mirrorbrain instance. Let's see how long it takes to get there.

User adoption has increased but it has yet to become the default mirror in several places.

5 June 2013

Raphael Geissert: The "let the tool do the work" update

Over the last few weeks I've been making several changes to http.debian.net to detect mirrors that don't follow Debian's mirroring guidelines and end up causing problems for end users. The changes will mean fewer hash mismatches and similar errors.

As I wrote back in December, the redirector is becoming nicer but also stricter. Some of the changes I recently made caused over 30 mirrors to be completely disabled from the redirector. This is not ideal and I don't like having to disable mirrors. They are contributions after all.

The only thing I can do, and can't stress enough, is encourage people to use an up-to-date ftpsync script (available at project/ftpsync/ on every mirror) to mirror Debian.
It takes care of all the little but important things for you. Really. A mirror that uses ftpsync is easier for the administrator to configure properly, and provides a consistent mirror for the benefit of the users.

Speaking of ftpsync, there is a new version! If you use ftpsync please upgrade it as soon as possible.

Other improvements are on their way. Contributions are welcome (if you like refactoring, there's quite a bit of explicitly-redundant code in check.pl that should now give a better idea of the way it needs to be refactored.)

8 May 2013

Raphael Geissert: Almost one million requests per day

In the first 48 hours after its log files were rotated last Sunday, http.debian.net handled almost 2 million requests, for an average of 11 requests per second.

In the last weeks before the release of Debian wheezy the number of requests had dropped slightly below 2 million per week.

Debian is alive.

4 May 2013

Raphael Geissert: A single address to get Debian Wheezy while it's hot

Already preparing to install or to upgrade to Debian Wheezy?

You can use the http.debian.net redirector to install Debian Wheezy or upgrade to it from Squeeze and make use of Debian's ever-growing 370-large mirrors network to get it.

APT one liner (to be used in your /etc/apt/sources.list file):

deb http://http.debian.net/debian wheezy main

During the installation process you can also choose to use it by manually entering http.debian.net as an HTTP mirror and /debian/ as the path.

Get it while it's hot!

2 May 2013

Raphael Geissert: An ever-growing mirrors network, a year later

A year ago I wrote about Debian's ever-growing mirrors network, so it is time to review the numbers.

Compared to the numbers from last year, today Debian is being served via http by about 370 mirrors world-wide, and is also available via ftp from 330 mirrors. So that's an increase of 40 mirrors in one year!

The number of countries with Debian mirrors also increased to 76, 3 more since last year.

This has only been possible thanks to the sponsors hosting the mirrors.
During this year some sponsors have had to retire their mirrors, sometimes ending years of contributions to the project and its community.

A big thank-you goes to past and current sponsors.

8 April 2013

Raphael Geissert: How the world ended up in Costa Rica

Even though I haven't had much time to dedicate to http.debian.net lately, it has been up and running, or should I say serving?

Part of its job is to detect mirrors that have temporary issues or are entirely gone, down, unavailable. It does so, and many other things, by monitoring the so-called "trace files". A very important one being the "master" (or "origin") trace file.

With the recent integration of backports into the main archive, the master trace file of the backports mirrors also changed. Long story short, this change caused backports mirrors to no longer be considered by the mirror redirector as candidates. As long as they were up to date.

After the usual mirror synchronisation delay, more and more mirrors were disabled and subsets of "up to date" candidates re-calculated. This reached a critical point when only one mirror was left in the database. The mirror had not been synchronised for a couple of weeks.

This mirror is located in Costa Rica, and as the only candidate left in the database it was the only one used to serve requests for the backports archive. No matter where the client was located in the world.

The issue was later noticed and the necessary updates to the mirrors master list made. Mirrors started to be re-considered as they were re-checked (with some delay due to the rate limiter) and the subsets re-calculated. In a few hours everything was back to normality.

Correctness and fault-tolerance don't always get along very well...

29 March 2013

Raphael Geissert: Chocolate quote

Kinda appropriate for some recent events:

Il y a autant de générosité à recevoir qu'à donner

- Julien Green

Roughly translated to

There is only as much generosity to receive as there is to give.

27 March 2013

Raphael Geissert: A bashism a week: substrings (dynamic offset and/or length)

Last week I talked about the substring expansion bashism and left writing a portable replacement of dynamic offset and/or length substring expansion as an exercise for the readers.

The following was part of the original blog post, but it was too long to have everything in one blog post. So here is one way to portably replace said code.

Let's consider that you have the file name foo_1.23-1.dsc of a given Debian source package; you could easily find its location under the pool/ directory with the following non-portable code:

file=foo_1.23-1.dsc
echo ${file:0:1}/${file%%_*}/$file

Which can be re-written with the following, portable, code:

file=foo_1.23-1.dsc
echo ${file%${file#?}}/${file%%_*}/$file

Now, in the Debian archive source packages with names with the lib prefix are further split, so the code would need to take that into consideration if file is libbar_3.2-1.dsc.

Here's a non-portable way to do it:

file=libbar_3.2-1.dsc
if [ lib = "${file:0:3}" ]; then
    length=4
else
    length=1
fi

# Note the use of a dynamic length:
echo ${file:0:$length}/${file%%_*}/$file

While here's one portable way to do it:

file=libbar_3.2-1.dsc
case "$file" in
    lib*)
        length=4
    ;;
    *)
        length=1
    ;;
esac

length_pattern=
while [ 0 -lt $length ]; do
    length_pattern="${length_pattern}?"
    length=$(($length-1))
done

echo ${file%${file#$length_pattern}}/${file%%_*}/$file

The idea is to compute the number of question marks needed and use them where needed. Here are two functions that can replace substring expansion as long as the values are not negative (negative values are also supported by bash).

genpattern() {
    local pat=
    local i="${1:-0}"

    while [ 0 -lt $i ]; do
        pat="${pat}?"
        i=$(($i-1))
    done
    printf %s "$pat"
}

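substr() {
    local str="${1:-}"
    local offset="${2:-0}"

    # Drop the first $offset characters
    if [ 0 -lt $offset ]; then
        str="${str#$(genpattern $offset)}"
    fi

    # Without a length, keep everything after the offset
    local length="${3:-${#str}}"

    # Keep only the first $length characters of what is left
    printf %s "${str%${str#$(genpattern $length)}}"
}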

Note that it uses local variables to avoid polluting global variables. Local variables are not required by POSIX:2001.

Enough about substrings!

Remember, if you rely on non-standard behaviour or features, make sure you document it and, if feasible, check for it at run-time.
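As an illustration of such a run-time check, the following sketch probes for the local built-in used by the functions above. supports_local is a hypothetical helper name, and the error message is only an example.

```shell
# supports_local is a hypothetical helper: it succeeds when the shell
# provides "local" and the local assignment does not leak out.
supports_local() {
    (
        # The probe runs in a subshell so nothing leaks into the caller
        x=outer
        probe() { local x=inner; }
        probe && [ "$x" = outer ]
    ) 2>/dev/null
}

if ! supports_local; then
    echo "this script needs a shell that implements 'local'" >&2
    exit 1
fi
```

Running the probe in a subshell keeps the test variable and the probe function out of the calling script.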

20 March 2013

Raphael Geissert: A bashism a week: substrings

Sometimes obtaining a substring in a shell script is needed. The bashism of this week comes in handy as it allows one to obtain a substring by indicating the offset and even the length of the substring. This is the ${varname:offset:length} bashism, also known as substring expansion.

The portable "replacements" are simple if the offset (and the length) are static. For example, the following code would print the substring of "foo" consisting of only the last two characters:

var=foo
# Replace the bashism ${var:1} with:
echo ${var#?}

The length can then be limited with additional pattern-matching removal expansions:

var="portable code"
# Replace the bashism ${var:3:5} with the following code

# Offset is 3, so we use three ? (question mark) characters:
part=${var#???}

# Length is 5, so we use five ? characters:
echo ${part%${part#?????}}

As can be seen, it is not impossible to replace a substring expansion.

The portable code becomes slightly more complex if the offset and/or the length are dynamic. I leave that as an exercise for the readers.

Feel free to post your code as a comment (use the <pre> tags, please) or in another public way. My own response is already scheduled to be published next week at the same time as usual.

Note: substring expansions can also be replaced with a wide variety of external commands. This is a pure-POSIX shell scripting example.
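For comparison, here is a sketch of the external-command route using cut, which is one of several utilities that could do the job. It assumes a single-line value; note also that some cut implementations count bytes rather than characters, which matters for multi-byte text.

```shell
var="portable code"
offset=3
length=5

# Equivalent of the bashism ${var:3:5}: cut numbers characters from 1,
# so offset 3 and length 5 become the range 4-8.
printf '%s\n' "$var" | cut -c "$(($offset + 1))-$(($offset + $length))"
```

This costs an extra process per substring, which is the price for not having to build the pattern by hand.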

13 March 2013

Raphael Geissert: A bashism a week: assigning to variables and special built-ins

Assigning a value to a variable when executing a command is a way to populate the command's environment, without the variable assignment persisting after the command completes. This is not true, however, when a special built-in is the command being executed.

POSIX:2001 states that "Variable assignments specified with special built-in utilities remain in effect after the built-in completes".

Not only is this tricky because it depends on whether a utility is a special built-in or not, but the bash interpreter does not respect that behaviour of the POSIX standard. That is, special built-ins are not so "special" to the bash interpreter.

This leaves two things to take into account when assigning to a variable when executing a command: whether the command is a special built-in, and whether bash is interpreting the script.

Now, the list of special built-ins is rather short and it would be a bit unusual to perform variable assignments when calling them, except for some cases: "exec", "eval", "." (dot), and ":" (colon).

It is important to note that ":" and "true" differ in this regard; the former is a special built-in, the latter is just a utility. Watch out for this kind of difference when using ":" or "true" to nullify a command. E.g.

Compare

$ dash -c '
method=sed
# some condition or user setting ends up making:
method=true
# later:
foo=bar $method
echo foo: $foo'
foo:

To (abridged for brevity):

$ dash -c '
method=:
foo=bar $method
echo foo: $foo'
foo: bar
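If the assignment must not outlive the command no matter which shell runs the script or which command ends up in the variable, one option is to scope it explicitly with a subshell. A minimal sketch, reusing the variable names from the examples above:

```shell
method=:
foo=original

# The assignment happens inside the subshell, so it cannot persist,
# whether or not $method turns out to be a special built-in.
( foo=bar; $method )

echo "foo: $foo"
```

The trade-off is that nothing else the command changes in the shell's state survives the subshell either.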

6 March 2013

Raphael Geissert: A bashism a week: returning

Inspired by Thorsten Glaser's comment about where you can break from, this "bashism a week" is about a behaviour not implemented by bash.

return is a special built-in utility, and it should only be used in functions and in scripts executed by the dot utility. That's what the POSIX:2001 specification requires.

If you return from any other scope, for example by accidentally calling it from a script that was not sourced but executed directly, the bash shell won't forgive you: it does not abort the execution of commands. This can lead to undesired behaviour.

A wide variety of shell interpreters silently handle such calls to return as if exit had been called.

An easy way to avoid such undesired behaviour is to follow the best practice of setting the e option, i.e.

set -e

With that option set at the moment of calling return outside of the allowed scopes, bash will abort the execution, as desired.

However, the POSIX specification does not guarantee the above behaviour either, as the result in such cases is "unspecified".
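For a file that may be either sourced or executed, a commonly seen idiom is to fall back to exit when return fails. The sketch below uses a temporary file as a stand-in for such a script. The file and its contents are made up for the demonstration, and mktemp is assumed to be available even though POSIX does not require it.

```shell
script=$(mktemp)
cat > "$script" <<'EOF'
echo before
# Sourced: return works. Executed: either the shell treats return
# like exit, or return fails and the explicit exit takes over.
return 0 2>/dev/null || exit 0
echo after
EOF

# Execute the file directly; the line printing "after" is not reached
output=$(sh "$script") || :
echo "$output"
rm -f "$script"
```

This way the script does not depend on what a given shell does with a return outside the allowed scopes.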

27 February 2013

Raphael Geissert: A bashism a week: appending

The very well known appending operator += is a bashism commonly found in the wild. Even though it can be used for things such as adding to integers (when the variable is declared as such) or appending to arrays, it is usually used for appending to a string variable.

As I previously blogged, the appending operator bashism is only useful when programming for the bash shell.

Whenever you want to append a string to a variable, repeating the name of the variable is the portable way. I.e.

foo=foo
foo="${foo} bar"
# Instead of foo+=" bar", which is a bashism

See? Replacing the += operator is not rocket science.

Note: One should be aware that makefiles do have a += operator which is safe to use when appending to a make variable. But don't let this "exception" fool you: code in configure.ac and similar files is executed by the shell interpreter. So don't use the appending operator there.

25 February 2013

Raphael Geissert: A tale of a bug report

Part 1:
A bug report is filed.
Part 2:
A patch is later provided by the submitter.
Part 3:
The patch is added to the package, the bug gets fixed.

[some time later]

Part 4:
A new upstream version is released, the patch is dropped.
Part 5:
The bug report is filed, again.

20 February 2013

Raphael Geissert: A bashism a week: pushing and pop'ing directories

Want to switch back-and-forth between directories in your shell script?
The bashism of this week can be of some help, but for most needs, the cd utility is more than enough.

pushd, popd, and the extra built-in dirs are bashisms that allow one to create and manipulate a stack of directory entries. For a simple, temporary, switch of directories the following code is portable as far as POSIX:2001 is concerned:

cd /some/directory
  touch some files
  unlink others
  # etc
cd - >/dev/null
# We are now back at where we were before the first 'cd'

Which is equivalent to the following, also portable, code:

cd /some/directory
  touch some files
  unlink others
  # etc
cd "$OLDPWD"
# We are now back at where we were before the first 'cd'

Multiple switches can also be implemented portably without storing the name of the directories in variables at the expense of using subshells (and their side-effects).
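A sketch of that subshell approach follows. The directories are arbitrary examples, and the side-effect is that variables set inside the parentheses are lost when each subshell ends.

```shell
start=$PWD
(
    cd /tmp || exit 1
    (
        cd / || exit 1
        pwd     # innermost directory
    )
    pwd         # back in the first directory, without any 'cd'
)
pwd             # back at where we were before the first subshell
```

Each closing parenthesis plays the role of a popd, because a subshell's change of directory never reaches its parent.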

However, if you think you can solve your problem more conveniently by using "pushd" and "popd", don't forget to document the need for those built-ins and to adjust the shebang of your script to that of a shell that implements them, such as bash.
